1 TTCGCCTGCG GGCCGGCACT GCTCACCTCT CGTCCAGGGA CATGACGGGC 
51 ACGCCAGGCG CCGTTGCCAC CCGGGATGGC GAGGCCCCCG AGCGCTCCCC 
101 GCCCTGCAGT CCGAGCTACG ACCTCACGGG CAAGGTGATG CTTCTGGGAG 
151 ACACAGGCGT CGGCAAAACA TGTTTCCTGA TCCAATTCAA AGACGGGGCC 
201 TTCCTGTCCG GAACCTTCAT AGCCACCGTC GGCATAGACT TCAGGAACAA 
251 GGTGGTGACT GTGGATGGCG TGAGAGTGAA GCTGCAGATC TGGGACACCG 
301 CTGGGCAGGA ACGGTTCCGA AGCGTCACCC ATGCTTATTA CAGAGATGCT 
351 CAGGCCTTGC TTCTGCTGTA TGACATCACC AACAAATCTT CTTTCGACAA 
401 CATCAGGGCC TGGCTCACTG AGATTCATGA GTATGCCCAG AGGGACGTGG 
451 TGATCATGCT GCTAGGCAAC AAGGCGGATA TGAGCAGCGA AAGAGTGATC
501 CGTTCCGAAG ACGGAGAGAC CTTGGCCAGG GAGTACGGTG TTCCCTTCCT 
551 GGAGACCAGC GCCAAGACTG GCATGAATGT GGAGTTAGCC TTTCTGGCCA 
601 TCGCCAAGGA ACTGAAATAC CGGGCCGGGC ATCAGGCGGA TGAGCCCAGC 
651 TTCCAGATCC GAGACTATGT AGAGTCCCAG AAGAAGCGCT CCAGCTGCTG 
701 CTCCTTCATG TGAATCCCAG GGGGCAGAGA GGAGGCTCTG GAGGCACACA 
751 GGATGCAGCC TTCCCCCTCC CAGGCCTGGC TTATTCCAAG AGGCTGAGCC 
801 AATGGGGAGA AAGATGGAGG ACTCACTGCA CAGCCGCTTC CTAGCAGGGA 
851 GCTATACTCC AACTCCTACT TGAGTTCCTG CGGTCTCCCC GCATCCACAG 
901 GGAGGGTAAA ACACTTAGCT TTTATTTTAA TAGTACATAA TTTAATACCA 
951 AAAAAGGCGC CTGGATCCCC AAAAAACCGA GGCTGGGAGC TAGTGGCCCT 
1001 TTTGCTTTCT AGGACTTGGG GGGCCGGCCC TCCCTCCTAA GCATAACAAA 
1051 GGTGGTGTTG CTCCAGCTCA GCCCCAGGGG ACACAGATGC ACTTTGGGGG 
1101 TGAGGGCAGG TAATGACTCC ATCGCACCCT CAGTTCAGCT GGACAGAGGC 
1151 TCAGGTGACC CCAGCCTTCA CTGTCTCCCG CTCTCCAGGA GCTTATCTTC 
1201 GCCCCATCTC CCAAATAAGT GGGCCCTTGT GCTGTGAGGA AGACCAAAGC 
1251 CTCAGGGAAG ATAAGAGATA TGGAGATGGG AGGGGGAGGA CAAGGGGCAG 
1301 AGAGTAGGGT CTAGCTGGCT ATCTCTGGCC TTACTAACAC CCCCCTGGAG 
1351 GCATGCCCCT TTTCTCCAGC ACACAAGCAC ATTGGGGCAC CTGGAAATAT 
1401 TGGTTCCAGG CTCCTGTTCT CTGGACTTCA GATCCTGGGG GAGCCCCTCC 
1451 CCCCCCTGAA TCCCTGGCTT AGCTACCTTC CTGCCTGTGC ACCTAAAAAC 
1501 CTCAGGTCAG AACTAGGAAA AGAGTTTTGT TTTTATTTTT TTGAAATGGA 
1551 GTCTCGTTCT GTCGCCCAGG CTGAGGTGCA GTAGTGCAAT CTCCGCTCAC 
1601 TACAACCTCC ACTCCCTGGG GCTCAAGCGA TCCTCCCACC TCAGCCGCCG 
1651 AAGTAGCTGG GACTATAGGT GTGTACCATC ACACCTGGCT AATTTTTGTA 
1701 TTTTTTGTAG ACACAGGGTT TCGCCATGTT GCCCAGGCTG GTCTTGAATT
1751 CCTGAGCTCA AGCAACCTGC CGGCCTCGGC CTCCCAAAGT ACTGGGATTA 
1801 CACGCAGAAG GCACCATGCC CAGGCTAGAT GTGTCTTATC CCAATCCTTT 
1851 GGCAGGCATG CAGCTCCACA GGCGATTTCT TCAAGCAGCT GAAGTGTTTA 
1901 GCCCTCCTGG GTTAAGAGCC AGATAAGGAG AAATCCCTTT CCTAGGTTTG 
1951 GAATGTGTTG TGAAAAAAAA GAGAAATCCC TGGCTCCTGG AGCTGGTGGG 
2001 AGACAAGATT AAGCAAACCT CCCCTGACAT GTATCCCTTT GACCCCAAGC 
2051 TCTGCCTCCT CCCTGACCAC CCATGCCCTT TCCTTTAACT TCTCAAACAG 
2101 ATACCAGGGC CTAAACTGCT TTACCTCCCC TCCTACTGAG TCAGGTTAGG 
2151 TGGTGGGAGG TCACCCATTT CCGAGTTAAA CCAATGCAAT ATGAGTAAAA 
2201 CAAAGTCATG TGGGTATGTC TGGGGTAGAG AGAGGGGTAG CAAGTTCATG 
2251 TGTCCTCCTT GGTCACATAT CTCCCAAAGC TCTGATCCCT GCCATGGGAA
2301 GTGGACAGGA AACATGAGGT CATGACCTGC AGGCATCTTT ACTGCAGCTC
2351 TGCCGGCCTG GAGGGGGAGA GGGGGAGGAA GAAGTATGCG CTGCACATTT
2401 CTGAGGCTAC TGCATTTGCT TTCAAGGCAG AAATCTTGCT CTGAGCAGTC
2451 AGCGGCTCCA GTTTGGGCCC GATAAGGAAG TTCTCCGTGG CCTCCCTCAG
2501 GCAGAGCAGG GAGGAGGCTG ACATTGCCAG TCTCTTCTGG GGCCCAAGGC 
2551 AGGTTGCAGG AGATCCAATC CCATAGACAG CTCTGGGCCT CTTGCATTTG
2601 AGTTTTTCAG AATTAAACTG CAGTATTTTG GAAAGCAAAA AAAAAAAAAA
2651 AAAAAAAAAA AAAAAAAAAA AAAA (SEQ ID NO:1)



FEATURES: 

5'UTR: 1-41 

Start Codon: 42 

Stop Codon: 711 

3'UTR: 714
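
The feature coordinates above can be cross-checked with a short script. The following is a minimal sketch, not part of the original listing: it takes only the first 100 nucleotides of SEQ ID NO:1 as printed, uses the standard genetic code, and the names `CDNA_1_100`, `CODON_TABLE`, and `translate` are illustrative assumptions. It confirms that a start at 42 and a stop at 711 imply 223 coding codons, and that translation from position 42 reproduces the N-terminus of SEQ ID NO:2.

```python
# First 100 nucleotides of SEQ ID NO:1, copied from the listing.
CDNA_1_100 = (
    "TTCGCCTGCGGGCCGGCACTGCTCACCTCTCGTCCAGGGACATGACGGGC"
    "ACGCCAGGCGCCGTTGCCACCCGGGATGGCGAGGCCCCCGAGCGCTCCCC"
)

START = 42   # first base of the start codon (1-based, from FEATURES)
STOP = 711   # first base of the stop codon (1-based, from FEATURES)

# Standard genetic code, built in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna, start):
    """Translate dna from 1-based position start; a trailing partial codon is ignored."""
    s = dna[start - 1:]
    usable = len(s) - len(s) % 3
    return "".join(CODON_TABLE[s[i:i + 3]] for i in range(0, usable, 3))


# Number of coding codons between the start codon and the stop codon.
coding_codons = (STOP - START) // 3
n_term = translate(CDNA_1_100, START)

print(coding_codons)  # 223, the length of SEQ ID NO:2
print(n_term)         # MTGTPGAVATRDGEAPERS
```

Only the first 19 residues are translated here because the sketch embeds only 100 nucleotides; passing the full cDNA to `translate` would extend the check to the whole open reading frame.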
Homologous proteins:

Top 10 BLAST Hits
                                                                      Score  E
CRA|103000001517087 /altid=gi|10946770 /def=ref|NP_067386.1| RA...    425    e-117
CRA|1000682330460 /altid=gi|657492 /def=ref|NP_055168.1| RAB26...     297    4e-79
CRA|18000004977238 /altid=gi|1710022 /def=sp|P51156|RB26_RAT RA...    294    3e-78
CRA|18000005013109 /altid=gi|1083775 /def=pir||JC2528 GTP-bindi...    293    7e-78
CRA|89000000198627 /altid=gi|7296421 /def=gb|AAF51708.1| (AE003...    273    9e-72
CRA|18000005076419 /altid=gi|7438397 /def=pir||T15123 hypothetic...   207    4e-52
CRA|18000004912300 /altid=gi|134236 /def=sp|P20791|SAS2_DICDI G...    203    7e-51
CRA|98000043536338 /altid=gi|12963499 /def=ref|NP_075615.1| cel...    203    9e-51
CRA|18000004929618 /altid=gi|131798 /def=sp|P24407|RAB8_HUMAN R...    202    1e-50
CRA|18000004952869 /altid=gi|131848 /def=sp|P22128|RAB8_DISOM R...    202    2e-50
CRA|18000005221564 /altid=gi|4586580 /def=dbj|BAA76422.1| (AB02...    202    2e-50

BLAST dbEST hits:
                                            Score  E
gi|13033710 /dataset=dbest /taxon=960...    1318   0.0
gi|12785775 /dataset=dbest /taxon=960...    1316   0.0
gi|12904236 /dataset=dbest /taxon=960...    1035   0.0
gi|9093496 /dataset=dbest /taxon=9606...    694    0.0

EXPRESSION INFORMATION FOR MODULATORY USE:

library source:
From BLAST dbEST hits:
gi|13033710 prostate
gi|12785775 brain
gi|12904236 T cells from T cell leukemia
gi|9093496 leukopheresis

From tissue screening panels:
leukocyte
1 MTGTPGAVAT RDGEAPERSP PCSPSYDLTG KVMLLGDTGV GKTCFLIQFK
51 DGAFLSGTFI ATVGIDFRNK VVTVDGVRVK LQIWDTAGQE RFRSVTHAYY 
101 RDAQALLLLY DITNKSSFDN IRAWLTEIHE YAQRDVVIML LGNKADMSSE 
151 RVIRSEDGET LAREYGVPFL ETSAKTGMNV ELAFLAIAKE LKYRAGHQAD 
201 EPSFQIRDYV ESQKKRSSCC SFM (SEQ ID NO: 2) 



FEATURES:

Functional domains and key regions:

[1] PDOC00001 PS00001 ASN_GLYCOSYLATION
N-glycosylation site 

114-117 NKSS 

[2] PDOC00004 PS00004 CAMP_PHOSPHO_SITE 

cAMP- and cGMP-dependent protein kinase phosphorylation site 

Number of matches: 2 

1 214-217 KKRS 

2 215-218 KRSS 

[3] PDOC00005 PS00005 PKC_PHOSPHO_SITE
Protein kinase C phosphorylation site 

Number of matches: 5 

1 29-31 TGK 

2 113-115 TNK 

3 149-151 SER 

4 173-175 SAK 

5 212-214 SQK 

[4] PDOC00006 PS00006 CK2_PHOSPHO_SITE 
Casein kinase II phosphorylation site 

116-119 SSFD 

[5] PDOC00008 PS00008 MYRISTYL 
N-myristoylation site 

Number of matches: 5

1 3-8 GTPGAV 

2 6-11 GAVATR 

3 39-44 GVGKTC 

4 52-57 GAFLSG 

5 57-62 GTFIAT 

[6] PDOC00017 PS00017 ATP_GTP_A 
ATP/GTP-binding site motif A (P-loop) 

36-43 GDTGVGKT 

[7] PDOC00579 PS00675 SIGMA54_INTERACT_1
Sigma-54 interaction domain ATP-binding region A signature

32-45 VMLLGDTGVGKTCF
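
The motif hits listed above can be reproduced with a simple pattern scan. The sketch below is illustrative only: it rewrites six of the PROSITE patterns as Python regular expressions (the PS00675 signature is left out), and the names `PROTEIN`, `PATTERNS`, and `find_sites` are assumptions introduced here, not part of the listing. A lookahead is used so that overlapping hits, such as KKRS at 214-217 and KRSS at 215-218, are both reported.

```python
import re

# SEQ ID NO:2, copied from the listing with spaces removed.
PROTEIN = (
    "MTGTPGAVATRDGEAPERSPPCSPSYDLTGKVMLLGDTGVGKTCFLIQFK"
    "DGAFLSGTFIATVGIDFRNKVVTVDGVRVKLQIWDTAGQERFRSVTHAYY"
    "RDAQALLLLYDITNKSSFDNIRAWLTEIHEYAQRDVVIMLLGNKADMSSE"
    "RVIRSEDGETLAREYGVPFLETSAKTGMNVELAFLAIAKELKYRAGHQAD"
    "EPSFQIRDYVESQKKRSSCCSFM"
)

# PROSITE patterns rewritten as regular expressions.
PATTERNS = {
    "PS00001": r"N[^P][ST][^P]",                  # N-{P}-[ST]-{P}
    "PS00004": r"[RK]{2}.[ST]",                   # [RK](2)-x-[ST]
    "PS00005": r"[ST].[RK]",                      # [ST]-x-[RK]
    "PS00006": r"[ST].{2}[DE]",                   # [ST]-x(2)-[DE]
    "PS00008": r"G[^EDRKHPFYW].{2}[STAGCN][^P]",  # G-{EDRKHPFYW}-x(2)-[STAGCN]-{P}
    "PS00017": r"[AG].{4}GK[ST]",                 # [AG]-x(4)-G-K-[ST]
}


def find_sites(pattern, seq):
    """Return (start, end, motif) tuples, 1-based inclusive, overlaps allowed."""
    lookahead = re.compile("(?=(" + pattern + "))")
    return [
        (m.start() + 1, m.start() + len(m.group(1)), m.group(1))
        for m in lookahead.finditer(seq)
    ]


for accession, pattern in PATTERNS.items():
    for start, end, motif in find_sites(pattern, PROTEIN):
        print(f"{accession}\t{start}-{end}\t{motif}")
```

These short patterns occur by chance in most proteins, so a hit from this scan is a candidate site rather than evidence of modification.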



Membrane spanning structure and domains: 

Helix Begin End Score Certainty 
1 48 68 0.715 Putative 
BLAST Alignment to Top Hit:

>CRA|103000001517087 /altid=gi|10946770 /def=ref|NP_067386.1| RAB37,
member of RAS oncogene family; GTPase Rab37 [Mus
musculus] /org=Mus musculus /taxon=10090 /dataset=nraa
/length=223
Length = 223

Score = 425 bits (1081), Expect = e-117
Identities = 209/223 (93%), Positives = 215/223 (95%)
Frame = +3

Query: 42  MTGTPGAVATRDGEAPERSPPCSPSYDLTGKVMLLGDTGVGKTCFLIQFKDGAFLSGTFI 221
           MTGTPGA    DGEAPERSPP SP+YDLTGKVMLLGD+GVGKTCFLIQFKDGAFLSGTFI
Sbjct: 1   MTGTPGAATAGDGEAPERSPPFSPNYDLTGKVMLLGDSGVGKTCFLIQFKDGAFLSGTFI 60

Query: 222 ATVGIDFRNKVVTVDGVRVKLQIWDTAGQERFRSVTHAYYRDAQALLLLYDITNKSSFDN 401
           ATVGIDFRNKVVTVDG RVKLQIWDTAGQERFRSVTHAYYRDAQALLLLYDITN+SSFDN
Sbjct: 61  ATVGIDFRNKVVTVDGARVKLQIWDTAGQERFRSVTHAYYRDAQALLLLYDITNQSSFDN 120

Query: 402 IRAWLTEIHEYAQRDVVIMLLGNKADMSSERVIRSEDGETLAREYGVPFLETSAKTGMNV 581
           IRAWLTEIHEYAQRDVVIMLLGNKAD+SSERVIRSEDGETLAREYGVPF+ETSAKTGMNV
Sbjct: 121 IRAWLTEIHEYAQRDVVIMLLGNKADVSSERVIRSEDGETLAREYGVPFMETSAKTGMNV 180

Query: 582 ELAFLAIAKELKYRAGHQADEPSFQIRDYVESQKKRSSCCSFM 710
           ELAFLAIAKELKYRAG Q DEPSFQIRDYVESQKKRSSCCSF+
Sbjct: 181 ELAFLAIAKELKYRAGRQPDEPSFQIRDYVESQKKRSSCCSFV 223 (SEQ ID NO: 4)
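
The identity figure reported for this alignment can be recomputed directly from the two aligned sequences. This is a minimal sketch under stated assumptions: the alignment is gapless, the subject string is transcribed from the Sbjct lines with two OCR-damaged stretches restored from the match line (residues 70 and 136-137), and `QUERY`, `SBJCT`, and the derived variable names are illustrative. The "Positives" count is not reproduced because it depends on a substitution matrix that the listing does not specify.

```python
# Query: SEQ ID NO:2, in the same four blocks as the alignment.
QUERY = (
    "MTGTPGAVATRDGEAPERSPPCSPSYDLTGKVMLLGDTGVGKTCFLIQFKDGAFLSGTFI"
    "ATVGIDFRNKVVTVDGVRVKLQIWDTAGQERFRSVTHAYYRDAQALLLLYDITNKSSFDN"
    "IRAWLTEIHEYAQRDVVIMLLGNKADMSSERVIRSEDGETLAREYGVPFLETSAKTGMNV"
    "ELAFLAIAKELKYRAGHQADEPSFQIRDYVESQKKRSSCCSFM"
)

# Subject: mouse Rab37 (SEQ ID NO:4), transcribed from the Sbjct lines.
SBJCT = (
    "MTGTPGAATAGDGEAPERSPPFSPNYDLTGKVMLLGDSGVGKTCFLIQFKDGAFLSGTFI"
    "ATVGIDFRNKVVTVDGARVKLQIWDTAGQERFRSVTHAYYRDAQALLLLYDITNQSSFDN"
    "IRAWLTEIHEYAQRDVVIMLLGNKADVSSERVIRSEDGETLAREYGVPFMETSAKTGMNV"
    "ELAFLAIAKELKYRAGRQPDEPSFQIRDYVESQKKRSSCCSFV"
)

# Gapless alignment: compare residue by residue.
identical = sum(q == s for q, s in zip(QUERY, SBJCT))
mismatches = [i + 1 for i, (q, s) in enumerate(zip(QUERY, SBJCT)) if q != s]

# Integer percentage, rounded down, as printed in the report.
percent_identity = 100 * identical // len(QUERY)

# Frame +3 query coordinates: 3 nucleotides per residue, starting at 42.
query_start = 42
query_end = query_start + 3 * len(QUERY) - 1

print(identical, len(QUERY), percent_identity)  # 209 223 93
print(mismatches)
print(query_start, query_end)                   # 42 710
```

The computed end coordinate, 710, is the last coding base before the stop codon at 711.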



Hmmer search results (Pfam):

Model    Description                              Score   E-value  N
PF00071  Ras family                               306.9   8.4e-90  1
CE00060  CE00060 rab_ras_like                     213.3   3.7e-60  1
PF01142  Uncharacterized protein family UPF0024   2.6     3.4      1

Parsed for domains:
Model    Domain  seq-f  seq-t     hmm-f  hmm-t  score  E-value
CE00060  1/1     31     191 ..    25     193    213.3  3.7e-60
PF01142  1/1     185    201 ..    444    462    2.6    3.4
PF00071  1/1     31     223 .]    1      198    306.9  8.4e-90
1 AGGGGAGAGA AAAGACCGCA TACCAGGCCA GGTGCGGTGG CTCACGCTTG
51 TAATCCCAGC AATTTGGAAG GCCAAGGCAG GCGTATCGCC TGAGGTCAGC 
101 AGTTCCAAAC CAGCCTGTCC AACATGGTGA AGTTCTCTAC TAAGAATACA 
151 AAAATTACCC AGGCGTGGTG GCGTGCACCT GTAGTCCCAG CTGCTCCAGA 
201 GGCTGAGGCA GGAGAATTGC TTGAACCTGG GAGGCAGAGG CTGCAATGCG 
251 CCAAGATCCC GCCACTGCAC TCCAGCCTGG GCGACAGAGT GAGACTCCGT 
301 CTCCGGGAGC CCACGGCATT GAGCAAACCT CGGCATTATT TGCAGCAAGA 
351 GCCTCTGGCA TCCAAATAGC AACCAACACC ACGCCTCTGT AGTGTGCTGC 
401 GCAGCCTCCA CACTCCAGTC TGAGGCTCCC TGTTTGAGTC CCGCCCTATG
451 CCCAGCTGAG GTTATAGCAC GCTCACCTCC AGAAGAGGTA ACCCAAGCTC 
501 TTTACTCTAC TGGAGATCAC CTCTGTCCCC ACTCTGGGCG CTTCTCCCAG 
551 CTGACAGAAA ATACCTCCAG CTGATGTCAG AAAATACAGG GCTGGAGGCT 
601 GGCGTACAAA GTCAGTCCCC ACAGGCCTAT GGTGGCCCAT AAGCCACGTC 
651 TACCCCTGCT CCTCACCTCC ACACCTAAGT TAAGAATTGC AGGCCGGGCG 
701 CAGTGGCTCA CGCCTGTAAT CCCAGCACTT TGGGAGGCTG AGGTGGGCGG 
751 ACCGCCTGAG GTCAGGAATT TGAGACCAGC TTGGCCAACA TGGCAAAACC
801 CCGTCTCTAC TAAAAATACA AAAAGAAAAA ATAGCCGGGC CTGATGTCGC 
851 GCACCTGTAA TCCCAGCTAC TCCGGGAGAC TGAGGCGGGA GTATAGCTTG 
901 AACCCGGGAA GCAAAGGTTG CAGTGAGGCG AGATCGCACC ACTGCACTCC 
951 AGGCTGGGCG ACAGAGTGAG ACTCTGTCTG AAAAAAAAAA AAAGTGCAGG 
1001 TACCCCTCTC CAGCTCTCCC CTCCCTACAC ATCCCTCAAA CCGTCCCGCT 
1051 GTAATGCACC CGCCCTGTTC CTTGGTAACT TGAAGCTGCT TATAGAATGT 
1101 GGAGATGGGG GTAATTGAAA GGTCGGCCCA GGCCACAGAG CCCCTGAGCT 
1151 CTGCTACCGG CAACCCCAGC TGCACTCCCC ACTCTCTGTC ACCAGGAGCT 
1201 GCCGGGTGCC TGGGATATCC TGGCAGCTCT GCTCAAAATG ATCTACGACT 
1251 TCATGAATTT ATTTGGCTCC TCCTCGGGGC CAGGGTGAGT GTCATGGGTT 
1301 AATAAGGCCG GCCCCGCCTT CAGGAGCGGT CCACTGGGAG ATGTGTGCTG 
1351 CGCAGCCCTC TTGCGAAAGC TCTCCCCTGG TGGGACATTC TGGGCACAAC 
1401 CAACAGGCCG GGGGAAATGA GAGGTGATCC ATACTAAAGG GTCAAAGTCC
1451 CCGCACCAGG CAGAGGCCCC AAAACACCGC AGCGTACATG TGCTGCAAGG 
1501 CGAGTACGGG TTGGTAAACA AAACTATATT CAGATGAGCT CGGGCCGGGT 
1551 GACTTAACAG ATGAGGAAGT GTCTCGGGGC CATCGGCGGA GGCGCAGCCC 
1601 AGGGGTCCCC AGCTCCCCGC CTCGCCACCT GGGGACAGCC CACGGCCCGG 
1651 GGCTCGGGCG CCGCCTGCTG TCGCGGTGCG CAGCGACTAC GGGAACTCTT 
1701 CCGCAGCAGA CGGGGTCCCC GCGGCCCGCT CCCCCAGGGG CAAGCAAGCG 
1751 ACCACAGGGG ACCGGTCCCG GGGCTGGATG TGGCTCATGT CCGAAGCGCA
1801 CGGAGCCGAG CCGGTGTTGC TCAGGGAGGC TGCCCGCCCC TTCACGCAGA 
1851 CCCTGCGGCT CTGCGTGCCC TCAGGGAACA GCAAGGTCCG AGCCGGTGTC 
1901 GTCGAGGGGG CGACGGGACG GAGGGAGGAG CCTGAGGGGT CCCGGTCGAG 
1951 GGAGGGGAGG AGTGGGCGGG GCGGGGGTGG GGGCCGTTCC CGCGCTCTCC 
2001 TTCGCCTGCG GGCCGGCACT GCTCACCTCT CGTCCAGGGA CATGACGGGC
2051 ACGCCAGGCG CCGTTGCCAC CCGGGATGGC GAGGCCCCCG AGCGCTCCCC 
2101 GCCCTGCAGT CCGAGCTACG ACCTCACGGG CAAGGTGGGT GGGCCTCTTC 
2151 CGTGAGACCC CCGCCCTCCT CGGCGCTAGC CCCTTCCTGG CTGCGTCTGG 
2201 GTTGGACTCA GCCCTTCCCC CAGGCAGCTG CGTCTCCCAG AGGAGGGAGG 
2251 GAGAGAGGGT CAGGACACAG CCTCTGGGGC CGTCCCAAGC TCTAGGTGTC 
2301 TCTGCTGGCT TGGTGGGGGC GGGTCGCGGA AGATCGCAAA AACTGAGTGA 
2351 TCCCCCCGCC GGCCCCAACT CAGTTCTCTT CTGCCACACT CTGGCAAATA 
2401 TGAGCCCCCG GGAGCCCATG CTTCTTGGTG AGGGTTAAGC GCGCAACTCT
2451 CGGGGCTCAG GCTGGGAAGG GCTGGGAGAT GGGGACCGAA CGGAGACTCG 
2501 GAGAGGACGT CCCCTGCTGG CAGAGGAACT GGCGTTAATG CCATTTTCCG 
2551 AGCTAAGCTC TTAGTTGAGA TCTGACATCC AGGTTTAAGG CCTGATGTCC 
2601 CCCAGCTGCT CCCCTCCCAT TCCACCCGCT GGAGGCACTG CCTCCCACCT
2651 TCCTCCCTGC AGTCGGAAGC CGCTCCTCCC AGAAGGATGT TGCCAGCCGG
2701 CCTGCAGGTC ACTTGGGAAT TTTTCGAACC TGAGAAAGAT TTCAGTGGTT
2751 GGTCTTTCGC ATCCCGCACT TGAGAGAGCT CCAGGGCTGC TCTCTGGGGC
2801 TTGCTCCCTC TACAGGGGTG TCCTGTATGG AAACAGGTAG GGACAGCAGT 
2851 GGACTGGTCT GTCGCCTTCC ATCTGTGTCC TTGGAGTGAG CGGGTACCAG 
2901 AAACTGAAAG AACTGCTGAG GGAGCCTAGA GCTTCCACTC TTCCTCTGCA
2951 GGGTTGGGGA TGGAGTGAGG GCTGTCCTGG ATTCCGCTGC ATGGCCTTGA
3001 AGGAGACCTG CCTCTCTCTG GGCCTCGGTT TCCTCCCCGA CACCAGGGCT 
3051 CACCCTTGCT GGGAGCCTCA GCCTCCACCC CAGTGTTTCG GGGGAAGCCA 
3101 CCCTGCAAGT CATCCGCCCA GAGCCGTTGA GATAGGCGTC CTGTGTGGGC 
3151 TTGTGGCAGG AAATGGGCCC CTGCACCCTC GGAGAGGAGG AGCTGCTGTT
3201 GGCCAGGCCC CAGGCTGAGG GGGACTGCCT GACCTTGTTG CCCTGCAAAC 
3251 CAGCTGGGTT GTTTGCCTAG GAGGTGGCCA GGCTAGGCAG CTGTTTGTGT 
3301 TTGGTGGAAT CACCGAGCTG GGTGGGTAGC TGGCATCGTT TGCTCAAGGC 
3351 AGCTGTGATC TGTAAAGTAC ACAAAGACTG GCCCTCCCTC CCTCCTTCCT 
3401 GCTCCAGGGC TGGGACCCAG GAGCCAGGGA GGAGTGCAGG CTCCAGAAAG
3451 CTCCTATCCC CCACCCCTTC ATCTGTTCCC TGGCCAAGCG GCATTGGCCG 
3501 GAGAGTTGGT CCCCAGCCTC CCCGGGCCTG CCCCAGGGGA GTGAGTCCAG 
3551 GACCCTCTGA GAAAGCCTGG CAGGAGCTCC TTGGACCAGA CTAGGGGTGA 
3601 TGTGGCCCAC AGGCAGACAG TTCCCACCCT GGGCCACTCT TCCCTGGGTC 
3651 TTAGGTGATT CACCACGATG ATGGGCCCTA GCCATTAACA GACTCTAGAA 
3701 ATACCTCAAA GACATTATCC CTCCTCCTTC TACCCACTAT GGAAACCATG 
3751 CCACAGAAAG GTTAAGGAAT CTTCCTAAAG TCACACAGTA GGCCATTTAC 
3801 AAATCAAGAC CCATCCTTCA TACCCCTTCT GCTCAGCCAC CCCTGCCTCT 
3851 CCACCAGAGT TAACTAATGC CAGTACCCCA TGCCCACAAC AGGAATGCCT 
3901 TTGGGCTCCA CTGTCAATTT CAGAGCCTCA AAAATAATTC AAACCTAGTC 
3951 CCTGCTTAAC CCATTAAGCC ACCTAACCAG CAGCTGGGAA ATTCCAGCAT 
4001 TGGATCTAGA CCCCTGTTAT CCAAGATTGG AGAACAGTGG GACAAAGTGC
4051 TCCTCTCCAC CATTCCTGCG TGTCCCTGGG GAAGATGAGC AGAGCAGAGC
4101 CAGACAGTAA AGGAGAGGGC CACGCCCCCT CCACAGGTTA CCTCCTTGGT 
4151 ACTCCTGCCC GCACTACCCA CAGCAACCCC GGGATGCCGA TCTGCAGCCA 
4201 CATGTCCCAT GTGGGAGGTT TCTGCTGAAA GAACTTCCAA CTACACATCT 
4251 CCCCACTTCA GTATAAATTT CAACCTTCCC TAATTCATGC AACCTTTTTT 
4301 TTTTTTTTTT TTTTTTGAGA CAGAGTGTCG CTCTGTCACC GAGGCTGGAG 
4351 TTCAGTGATG CAATCTCGGC TCACTGCAAC CTCTACCTCC TGGGTTCAAG 
4401 CTATTCTCCT GTCTCCGCCT CCCAAGTAAC TGGGACTACA GGCGTGTGCC
4451 ACCACTCCTG GCTAGTTTTT TGTATTTTTA GTAGAGATGG GGTTTCACCT
4501 TGTTGGTCAG GCTGGTCTCA AACTCCCAAC TCAGGTGATC CGTCCACTTG 
4551 GGCACCCAAA ATGNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 
4601 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN
4651 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN
4701 NNNNNNNNNN NNNNNNNNNN TTCAAGTACC AGCCTGGCCA ACATGGTAGA 
4751 AACCCCGTCT CTACTAAAAA TAAAAAATTA GCCAGGCGAG GTGGTGCATG
4801 CCTATAATCC CAGCTACTCA GGTAGGCTGA GGCAGGAGAA TCATTTAAAC
4851 CTGGGAGGTG GAGGTTGTGG TGAGCCAAGA TCTCGCCATT GCACTCCAGC 
4901 CTGGGCAACA AGAGCAAAAC TCCGTCTCAA AAAAAAAAAG AAAGAAAGAA
4951 AGAAAGAAAC TTCCAAATAA ATGTTGTGAC ACAAAAAAAA AAACCCAAAC
5001 AATATTCATT ATAGAGTATG CAAATGACCA TGCCCCACCC CCAGCAGATT 
5051 CTGATAGACT CCCTTGGGTG GGAATCCTTG TCCAATATAT TGACACTTCC 
5101 CTTTCCTGTC AGTATAGCCC AGCCCATGCG TGTACTCACG AGCGGACGAT 
5151 GGATGACACA AGTACACAGA GGGACGGAAT CCCTGCATGG TGTGGCTATG 
5201 GGCAAATGTG GCCACTGTCT AGATTGTGCA AATGTGGTGG TTCTCTGGGG 
5251 CCACAGAGCA CACTTGGGGA CCTGTTCATG GTGAGGTCTC AACTCCGGCC 
5301 TCTAGGAACT TGAATGAGGA CAGGAGGGTC AGAGGGAGAG CCTAGGAGGC 
5351 TGAGCCAAGG AGCGTGGAGA GGAGAGACAG GGTGAAGGTG GCGGCTGGCT 
5401 TTCTGGAAGC AGGTGGCCTT TGGTGCGGTC AGCATTCGTG CCAGCCCCCT
5451 CTTCTCTGAT CCTCTCCATG TGTCTCTCTC CTGGAATCCC AGAAGCTGCC 
5501 CCTGACTCCC CATTAACTGC CTCTGCCCCT ACCCCCTAGG TGATGCTTCT 
5551 GGGAGACACA GGCGTCGGCA AAACATGTTT CCTGATCCAA TTCAAAGACG 
5601 GGGCCTTCCT GTCCGGAACC TTCATAGCCA CCGTCGGCAT AGACTTCAGG 
5651 GTGAGGTGGC TGCAGGCACT TGCTTCCAGC AGAGAGCCAG GGCTGTGGCT 
5701 CAGGCATGGG GGGGTTGCCC CCACCTTGCT CACCCTGGCT CCCAGGGACT
5751 CCCGAGGCTC ATGCCTGGAG GGCACACAAC CCGCTCCCCC AAGACCACAG
5801 AGGTGGCCGG GTCAAAGGAG ACTGGGCAAG GTTGGCTCCT TGCCCAACTA 
5851 TAGGATGCAA AAAAATGAGA CTGAGTCTTC GATTCCAGCT CCATTCCTGG 
5901 GGGACTTCTC CCAAGCAGAG CAGCCGCAGG CACGGCATAA GCTGAATATC 
5951 TTGGCCCACA GAGCCCCTGC TCATTGCTCT CCTACCTGGG CCCCTTTGGA 
6001 AAGGCCTCAA AGGTCAATCA GTCTTTCTGG AGTTCCCAGA AAGCACAGCC 
6051 CTGCACTGGG TTTAAGAGCT GGGCTTGGGC CAGGCATGGT GGCTCTTGCC 
6101 TGTATTCCCA GCACTTTGGG AGGCCGAAGC GGTCAGATCA CAAGGTCAGG 
6151 AGTTTGAGAC CAGCCTGGCC AACATGGTGA AACCCCGTCT CTACTAAAAA 
6201 TACAAAAATT AGCCAGGTGT AGTGGCACGC TCCTGCAGTC CCAGCTACTC 
6251 GGGAGGCTGA GGCAGGAGAA TCGCTCAAAT CCGGGTGGTG GAGGTTGCAG
6301 TGAGCTGAGA TCGCGCCACT GCACTCCAGC CTGGGCAACA AAGTGAGACT
6351 GCGTCTCAGA AAAAAAAAAA AAAAAAGAGC TGGGCTGGCC ATGTTGGGAG 
6401 ACAGCAGCTC ACCAGGGACC CTCCCTCTCA CCTTGACGAC TCCATCTTAC 
6451 AAATCTGCAT CAGGGATGCT AGACGCTGCA CACCTGAAGT GTTCAATAGA
6501 GAAAAGGTCT CACCCTGGCA GGTGGGGCTC TACAGCTTCA AGCAGGCAGA 
6551 AAGCGAACAC TTCCTTCACT AGAGAATTAG TGGGCAGCTA AAGAAAAGGT 
6601 GCTGCTGCAG ATGTAGCCTC AGGTCCCCAG GATGCAGGCA AACACCCCAT 
6651 CTCCAGGGGC TCGGTCACAG TCCCAAGGCT AGGCTCCAGG AGAGGGAGAC 
6701 CGAAGTGGGG AAAGGGCAGG GCCTCCAGCA GCAACCAGCC CTCCAGCCCT 
6751 GGGCTGCCTG ATCCCTGGAG AGAGCCAGGA TGTTTCTCAG GCTCCTCTTG
6801 CCCTGCTGTT GTGAGAAGGC AGTTACAGTC CTCAGAAGGG ACGACTCCAC 
6851 AGTGGAGGTG TCTGGGTATG GGGTTCCTGC TGCCCTGATG GTATGATCTG 
6901 GCTGGAGACG GTTCTGGGGC TCACTGCACC CACTCTAGGC CTGGAGAGGG 
6951 AACAAGAGAG GACGTCTGCA GAGCTGAGGA GCCACATGAC TCCTGCCCTC 
7001 CCATCCTCTG CCTTTTTCTC TTTCAGAACA AGGTGGTGAC TGTGGATGGC 
7051 GTGAGAGTGA AGCTGCAGGT GAGACCAGAG GCTGGAGTTG GGGAGGGAGG 
7101 ATGGAGGACC TGCCCTTCCT TCTCACCCTG AACCACAGGA GGCCTGCAGC 
7151 CCTGCCCTCC GCCTGGGGCA ATTTCCTGTG GGGCCCACGG GAGGAAATGG 
7201 CTTTTGTTTA TTTGACATCT GCAGAAAAAG CAGTTCCCAG GCACCCTCTC 
7251 ATCTATGAAC AGCAGCTCCA AATGCCTTCA GACAAGCTTA GCCTCCATCC 
7301 ATCTCCTCCC CAGTTGCCAG GGCTTTATCT GCTCTTAGGA GATTGGACAT 
7351 CCCCAACCCC TGAGCTAGGG GAGAGGAGAA GATTCTTTTT TTTTCTTTTC 
7401 TTTTCTTTTT TTTTTTGAGA TGGAGTCTCG CTCTGTCGCC CAGGCTGGAG
7451 TGCAGTGGCA CAATCTCGGC TCACTGCAAC CTCTGCCTCC CAGGTTTAAG 
7501 AGATTCTCCT GCCTCAGCCT CCTGAGTAGC TGAGACTACA GGTGCATGCC 
7551 ACCACACCTG GCTAATTTTT TGTATTTTTA GTAGAGACGG GGTTTCACTG 
7601 TGTTAGCCAG GATGGTCTGG ATCTCCTGAC CTCGTGATCC GCCTGCCTCG
7651 GCCTCCCAAA GTGCTGGGAT TACAGGTGTA AGCCACCGCG CTCGGCTGAG
7701 GAGATGATTT TGAACGAGCT TGAGAAATCA GTAACTGCTA CTGTCCAGGT 
7751 CATTGGATGC TCAGGGGCTC ATGAGAACCT AAAGAAGAAA ACAGCCCCAC 
7801 CTTCCCACAG ATATCTCATA CAACAAAGCA GGCCTGCTCC ACCCAGCACA 
7851 TTCCTTGCAC CTGCCTCCTT CTGACCATTT CTCCATCCCA TCCCTTCCCA 
7901 GATCTGGGAC ACCGCTGGGC AGGAACGGTT CCGAAGCGTC ACCCATGCTT
7951 ATTACAGAGA TGCTCAGGGT GAGTCCCTCG CACCCTCCAA CCCCTACCCC
8001 AGCCCCTTGG TAGCATCCGT GCTGCTGCCT AAGTCCCCTC TGTGATCCTC 
8051 TCCCCTCCAG CCTTGCTTCT GCTGTATGAC ATCACCAACA AATCTTCTTT 
8101 CGACAACATC AGGGTAGGTC CTCCCTTCCC CTGACTCCCA CCCATAAGCA 
8151 GCCAAGGCAA GGTCTATGCA GGCTGGGGTT GCTTCCTGCC CTGTGGAAAG 
8201 CGGGTGGAGC GTGGAGTCCT CCTGCCTTCT GAAAAACACC TACTTGTGAC 
8251 TCAGAAGTCA TATCTGCTGC TTTGTATTTG GTGGCCATGT GGGCATGAAG 
8301 GCCAAGCAGG CTGTTGTGAC CCTGTGCCAC CTGCATAGCC CTCACTGTGA 
8351 TTCACGAGTG TGTTTCGTGA CAAAGTGTTC AGAACAGCCC CCACTCCACC 
8401 CTGGATAATT ATCCACAGAG ACCAAGGGAA AAACACAACC AGAAAAGTCC
8451 ACACATACAT CCAGGGCAAG TTGCAAGAAA GTGACTCAGT CAGACAGAGT
8501 GAGTGGTTGT ATCCTCACAA CCAAACTATT ATAGAGACAA AAATTTGATA 
8551 AATTCAAGCA CCAATTTTGT TCACGACATT GTATAGGTTT CATGAATCCC 
8601 CTGACCTCAA GGACAGTTTG CTGATAAGCA AACTAGGAGA ATAAAACGTT 
8651 TATATAGAAA GAGGAAAATC CATGGCACTC ATACTCCTAC CTCCAACCCC 
8701 ATGCTCATGG CAGACATCAC TAATCAATCA CAGTACTTTT GATCACTGAA 
8751 ACCCTTATGT GGTCTTAGAA TCTTTAACAG GACACTCCAA GAAATCACTG
8801 CTGACAGCCA ACTGATTTGT GAGATAAGGT CTCCATGCAT CTGGATCTTC 
8851 CATAGAACTG ATAGTTGCAC AGCATAAAAT GGTGAGGGTG GGGCCATTGT
8901 GGGTTGAGCC ACCAAGGAAG GCCATCCAGG CCTGGATGGG CCAGAACAAA
8951 GGTACAGATG AGAGAACGCA CAGGGTATCG TGTTCAAGGT AGTGAGTAAC 
9001 TGAGGATAGT CAAACGGAGC AGAAGAAGAA AGGGGCAGCA GGAGGAAGAG 
9051 AATGCCAGTC TCGCACGCCC TCTCCCACAG GCCTGGCTCA CTGAGATTCA 
9101 TGAGTATGCC CAGAGGGACG TGGTGATCAT GCTGCTAGGC AACAAGGTGA 
9151 GTGGCTCCGG GGCAGGGTCA GCCCAGCCCT GCACTTCCTC AGCCCTAGCC 
9201 GGCCCCATAA CCACCCAAGA ACAGTTATCT AGGCATCCTT CCTGAAAAGG 
9251 ACTCTGCAGC CTCCAGCTCA GGGGTCAGAC ATATCTGGAG GCTTCTGCCC 
9301 ATCCCATCTG CCCCTTCCAG GGAAAGTCCA AGTTGTTGCC TGAGAAATCA 
9351 AGGGGTGCCC AGTTCTCAGC CCCCATTAGA GCAGAGTGAA CAGGGTCCCA 
9401 GGTCAGGGGC TAAGAGTGCA AAGGGTTAGC CCCAACTGCT GTCCTATTCC
9451 AAGACCCTTT ACCAAAGGTG AGATCCCAGA GCTGGGAGCT ACACTGGGCA
9501 GAAACCCTGG CCCCAGGCCA ATCACACCTG CCTGCAGTCC CTTGGGCCAC
9551 CAGCAGAGGG CAGGCAACGC CTGCTTCTGG GGCAAAATAT GGGCCCGCTG
9601 GGGCGGAGGC CTCCTTCCCC AGAGTGACCC ATTTGGGCTT GACAGGCGGA
9651 TATGAGCAGC GAAAGAGTGA TCCGTTCCGA AGACGGAGAG ACCTTGGCCA
9701 GGGTAAGTGA TTGTCTGTGG GACAGGGTGA AGGGTGGGGG CAACCCGACG
9751 CTGGCCCTGA GGACACTCTC TCCCGGGCAG GAGTACGGTG TTCCCTTCCT
9801 GGAGACCAGC GCCAAGACTG GCATGAATGT GGAGTTAGCC TTTCTGGCCA
9851 TCGCCAAGTG AGAGCTGGGC AGGGAAGGGA AGTGTGCGGG GCAGGGCGGC
9901 ACACTCCAGG AATCCAGTAG GGCCCGGCCC CTGGCCCAGC CCCTGGACAC
9951 ACCTGCATTC TGCAGGCTGA GGTCCATTTG CTCTGGGAGC ACTGGGCCAC
10001 TGGGAGAGGG GAGGGGGCGG CTCAGCTCCT CACCCCAGCC CAGCCCAGCC
10051 CAGCCCAGCC CATTGTCTCT TCTTCAAGGG AACTGAAATA CCGGGCCGGG
10101 CATCAGGCGG ATGAGCCCAG CTTCCAGATC CGAGACTATG TAGAGTCCCA
10151 GAAGAAGCGC TCCAGCTGCT GCTCCTTCAT GTGAATCCCA GGGGGCAGAG
10201 AGGAGGCTCT GGAGGCACAC AGGATGCAGC CTTCCCCCTC CCAGGCCTGG
10251 CTTATTCCAA GAGGCTGAGC CAATGGGGAG AAAGATGGAG GACTCACTGC
10301 ACAGCCGCTT CCTAGCAGGG AGCTATACTC CAACTCCTAC TTGAGTTCCT
10351 GCGGTCTCCC CGCATCCACA GGGAGGGTAA AACACTTAGC TTTTATTTTA
10401 ATAGTACATA ATTTAATACC AAAAAAGGCG CCTGGATCCC CAAAAAACCG
10451 AGGCTGGGAG CTAGTGGCCC TTTTGCTTTC TAGGACTTGG GGGGCCGGCC
10501 CTCCCTCCTA AGCATAACAA AGGTGGTGTT GCTCCAGCTC AGCCCCAGGG
10551 GACACAGATG CACTTTGGGG GTGAGGGCAG GTAATGACTC CATCGCACCC
10601 TCAGTTCAGC TGGACAGAGG CTCAGGTGAC CCCAGCCTTC ACTGTCTCCC
10651 GCTCTCCAGG AGCTTATCTT CGCCCCATCT CCCAAATAAG TGGGCCCTTG
10701 TGCTGTGAGG AAGACCAAAG CCTCAGGGAA GATAAGAGAT ATGGAGATGG
10751 GAGGGGGAGG ACAAGGGGCA GAGAGTAGGG TCTAGCTGGC TATCTCTGGC
10801 CTTACTAACA CCCCCCTGGA GGCATGCCCC TTTTCTCCAG CACACAAGCA
10851 CATTGGGGCA CCTGGAAATA TTGGTTCCAG GCTCCTGTTC TCTGGACTTC
10901 AGATCCTGGG GGAGCCCCTC CCCCCCCTGA ATCCCTGGCT TAGCTACCTT
10951 CCTGCCTGTG CACCTAAAAA CCTCAGGTCA GAACTAGGAA AAGAGTTTTG
11001 TTTTTATTTT TTTGAAATGG AGTCTCGTTC TGTCGCCCAG GCTGAGGTGC
11051 AGTAGTGCAA TCTCCGCTCA CTACAACCTC CACTCCCTGG GGCTCAAGCG
11101 ATCCTCCCAC CTCAGCCGCC GAAGTAGCTG GGACTATAGG TGTGTACCAT
11151 CACACCTGGC TAATTTTTGT ATTTTTTGTA GACACAGGGT TTCGCCATGT
11201 TGCCCAGGCT GGTCTTGAAT TCCTGAGCTC AAGCAACCTG CCGGCCTCGG
11251 CCTCCCAAAG TACTGGGATT ACACGCAGAA GGCACCATGC CCAGGCTAGA
11301 TGTGTCTTAT CCCAATCCTT TGGCAGGCAT GCAGCTCCAC AGGCGATTTC
11351 TTCAAGCAGC TGAAGTGTTT AGCCCTCCTG GGTTAAGAGC CAGATAAGGA
11401 GAAATCCCTT TCCTAGGTTT GGAATGTGTT GTGAAAAAAA AGAGAAATCC
11451 CTGGCTCCTG GAGCTGGTGG GAGACAAGAT TAAGCAAACC TCCCCTGACA
11501 TGTATCCCTT TGACCCCAAG CTCTGCCTCC TCCCTGACCA CCCATGCCCT
11551 TTCCTTTAAC TTCTCAAACA GATACCAGGG CCTAAACTGC TTTACCTCCC
11601 CTCCTACTGA GTCAGGTTAG GTGGTGGGAG GTCACCCATT TCCGAGTTAA
11651 ACCAATGCAA TATGAGTAAA ACAAAGTCAT GTGGGTATGT CTGGGGTAGA
11701 GAGAGGGGTA GCAAGTTCAT GTGTCCTCCT TGGTCACATA TCTCCCAAAG
11751 CTCTGATCCC TGCCATGGGA AGTGGACAGG AAACATGAGG TCATGACCTG
11801 CAGGCATCTT TACTGCAGCT CTGCCGGCCT GGAGGGGGAG AGGGGGAGGA
11851 AGAAGTATGC GCTGCACATT TCTGAGGCTA CTGCATTTGC TTTCAAGGCA
11901 GAAATCTTGC TCTGAGCAGT CAGCGGCTCC AGTTTGGGCC CGATAAGGAA
11951 GTTCTCCGTG GCCTCCCTCA GGCAGAGCAG GGAGGAGGCT GACATTGCCA
12001 GTCTCTTCTG GGGCCCAAGG CAGGTTGCAG GAGATCCAAT CCCATAGACA
12051 GCTCTGGGCC TCTTGCATTT GAGTTTTTCA GAATTAAACT GCAGTATTTT
12101 GGAAAGCACA TCCTGTCCAC TGTTTCTTTG AAGTGAGTGG GGGGGGGGGG
12151 TCTTGTTGAA GGAATTGTCA TTCACTGCCA AAATCATTCC ATCCTCCTTC
12201 CTCAGTGTCT GTCCTCAGAT GGTCAGCTCC CCGCTCAACA GACTGTCTCC
12251 CGCCTCTGTG ACCAGCCTCT CTTTGGCAAG AGGGAGCTAG AAGGCTTTAC
12301 AGTCCTAATC ATTTTTCTGT TGGAAAAAAA AAAAAAAAAC CAAGGCTCCT
12351 TTCCCTGTGG CGTGTACCCA GAGGTTGATT ACCTGAGTCT GTCCTGCCTC
12401 TCCCCACCCC ACCTCCCTAG CCAAACGCTG CTGCCAAAGC CCACGCTATT
12451 GCCCTAGATG GCCTGTCTTC AGCGGGCTGC CCCTCGAGGT CCCAGGCTCT
12501 CCGCGGAGCC CTCACCTTCC CAGCAGGGAT CAGAACCTGC ACTCCTCTAT
12551 GCGAGTCCTG GGACAGCACA AAGTGGATTA GGGTTAGGGT TCCCACAAAC
12601 GGAAAAATGT TATTCAAACA ACTCTGTAGG GTCCGAGGAG GCCCTCCGTC
12651 TTAATTCTCG AGACTGACCG GCCCTCGCTG CCCCGAGCGG GAGCAGTTGC
12701 CCCGGCAACA GCCGCTCCCT CTCAACTGGA GCTGCACCCA GGCTTTGGCT
12751 AAAGGCTGTT AAAACGTTGG CCAGGTGCGG AGGCTCACGT CTGTAATCCC
12801 AGGGCGGATC ACCTGAGGTC AGGAGTTTGA AACCATCCTG GCCAACATGG
12851 CGAAATTTCG TCTCTACTAA AAATACAAAA ATTAGCGGGG CGTGGTGGTG 
12901 CGCGCCTGTA ACCCCAGCTG CTCGGGAGGC TGAGGCAGGG GAATCGCTTG 
12951 AACCCGGGAG GCGGAGGTTG CAGTGATCCG AGATCGCGCC ACGGCAGTCC 
13001 AGCCTGGGCG ACAGAGCGAG ACTCCGTCTC AAAAAAAAAA AAAAAAGTTA 
13051 GGGTCCTTTA CCCGAGGGCC GGCTTTCCTC ACTCCCCGCC ACAGGTAGGG 
13101 GAAACCAGGC CGGAGCCGGC GGGCCCACCC GCCCAGAACC GGGAATTCGG 
13151 CGAGCCCCGC CCCTGCCACC CCAGCGCCGG CC (SEQ ID NO: 3)
Context:

DNA Position

4259 ACCCATTAAGCCACCTAACCAGCAGCTGGGAAATTCCAGCATTGGATCTAGACCCCTGTT
ATCCAAGATTGGAGAACAGTGGGACAAAGTGCTCCTCTCCACCATTCCTGCGTGTCCCTG 
GGGAAGATGAGCAGAGCAGAGCCAGACAGTAAAGGAGAGGGCCACGCCCCCTCCACAGGT 
TACCTCCTTGGTACTCCTGCCCGCACTACCCACAGCAACCCCGGGATGCCGATCTGCAGC 
CACATGTCCCATGTGGGAGGTTTCTGCTGAAAGAACTTCCAACTACACATCTCCCCACTT 
[C,T] 

AGTATAAATTTCAACCTTCCCTAATTCATGCAACCTTTTTTTTTTTTTTTTTTTTTTGAG 
ACAGAGTGTCGCTCTGTCACCGAGGCTGGAGTTCAGTGATGCAATCTCGGCTCACTGCAA 
CCTCTACCTCCTGGGTTCAAGCTATTCTCCTGTCTCCGCCTCCCAAGTAACTGGGACTAC 
AGGCGTGTGCCACCACTCCTGGCTAGTTTTTTGTATTTTTAGTAGAGATGGGGTTTCACC 
TTGTTGGTCAGGCTGGTCTCAAACTCCCAACTCAGGTGATCCGTCCACTTGGGCACCCAA 

4325 GATTGGAGAACAGTGGGACAAAGTGCTCCTCTCCACCATTCCTGCGTGTCCCTGGGGAAG 
ATGAGCAGAGCAGAGCCAGACAGTAAAGGAGAGGGCCACGCCCCCTCCACAGGTTACCTC 
CTTGGTACTCCTGCCCGCACTACCCACAGCAACCCCGGGATGCCGATCTGCAGCCACATG 
TCCCATGTGGGAGGTTTCTGCTGAAAGAACTTCCAACTACACATCTCCCCACTTCAGTAT 
AAATTTCAACCTTCCCTAATTCATGCAACCTTTTTTTTTTTTTTTTTTTTTTGAGACAGA 
[G,T]

TGTCGCTCTGTCACCGAGGCTGGAGTTCAGTGATGCAATCTCGGCTCACTGCAACCTCTA 
CCTCCTGGGTTCAAGCTATTCTCCTGTCTCCGCCTCCCAAGTAACTGGGACTACAGGCGT 
GTGCCACCACTCCTGGCTAGTTTTTTGTATTTTTAGTAGAGATGGGGTTTCACCTTGTTG 
GTCAGGCTGGTCTCAAACTCCCAACTCAGGTGATCCGTCCACTTGGGCACCCAAAATG 

4348 TGCTCCTCTCCACCATTCCTGCGTGTCCCTGGGGAAGATGAGCAGAGCAGAGCCAGACAG
TAAAGGAGAGGGCCACGCCCCCTCCACAGGTTACCTCCTTGGTACTCCTGCCCGCACTAC 
CCACAGCAACCCCGGGATGCCGATCTGCAGCCACATGTCCCATGTGGGAGGTTTCTGCTG 
AAAGAACTTCCAACTACACATCTCCCCACTTCAGTATAAATTTCAACCTTCCCTAATTCA 
TGCAACCTTTTTTTTTTTTTTTTTTTTTTGAGACAGAGTGTCGCTCTGTCACCGAGGCTG 
[G,A] 

AGTTCAGTGATGCAATCTCGGCTCACTGCAACCTCTACCTCCTGGGTTCAAGCTATTCTC 
CTGTCTCCGCCTCCCAAGTAACTGGGACTACAGGCGTGTGCCACCACTCCTGGCTAGTTT 
TTTGTATTTTTAGTAGAGATGGGGTTTCACCTTGTTGGTCAGGCTGGTCTCAAACTCCCA 
ACTCAGGTGATCCGTCCACTTGGGCACCCAAAATG 

4924 TTCAAGTACCAGCCTGGCCAACATGGTAGAAACCCCGTCTCTACTAAAAATAAAAAATTA
GCCAGGCGAGGTGGTGCATGCCTATAATCCCAGCTACTCAGGTAGGCTGAGGCAGGAGAA 
TCATTTAAACCTGGGAGGTGGAGGTTGTGGTGAGCCAAGATCTCGCCATTGCACTCCAGC 
CTGGGCAACAAGAGCAAAACTCC 
[G,A] 

TCTCAAAAAAAAAAAGAAAGAAAGAAAGAAAGAAACTTCCAAATAAATGTTGTGACACAA
AAAAAAAAACCCAAACAATATTCATTATAGAGTATGCAAATGACCATGCCCCACCCCCAG 
CAGATTCTGATAGACTCCCTTGGGTGGGAATCCTTGTCCAATATATTGACACTTCCCTTT 
CCTGTCAGTATAGCCCAGCCCATGCGTGTACTCACGAGCGGACGATGGATGACACAAGTA 
CACAGAGGGACGGAATCCCTGCATGGTGTGGCTATGGGCAAATGTGGCCACTGTCTAGAT 

4983 TTCAAGTACCAGCCTGGCCAACATGGTAGAAACCCCGTCTCTACTAAAAATAAAAAATTA
GCCAGGCGAGGTGGTGCATGCCTATAATCCCAGCTACTCAGGTAGGCTGAGGCAGGAGAA 
TCATTTAAACCTGGGAGGTGGAGGTTGTGGTGAGCCAAGATCTCGCCATTGCACTCCAGC 
CTGGGCAACAAGAGCAAAACTCCGTCTCAAAAAAAAAAAGAAAGAAAGAAAGAAAGAAAC
TTCCAAATAAATGTTGTGACAC
[-,A] 

AAAAAAAAAACCCAAACAATATTCATTATAGAGTATGCAAATGACCATGCCCCACCCCCA
GCAGATTCTGATAGACTCCCTTGGGTGGGAATCCTTGTCCAATATATTGACACTTCCCTT 
TCCTGTCAGTATAGCCCAGCCCATGCGTGTACTCACGAGCGGACGATGGATGACACAAGT 
ACACAGAGGGACGGAATCCCTGCATGGTGTGGCTATGGGCAAATGTGGCCACTGTCTAGA 
TTGTGCAAATGTGGTGGTTCTCTGGGGCCACAGAGCACACTTGGGGACCTGTTCATGGTG 

6710 CACCAGGGACCCTCCCTCTCACCTTGACGACTCCATCTTACAAATCTGCATCAGGGATGC 
TAGACGCTGCACACCTGAAGTGTTCAATAGAGAAAAGGTCTCACCCTGGCAGGTGGGGCT 
CTACAGCTTCAAGCAGGCAGAAAGCGAACACTTCCTTCACTAGAGAATTAGTGGGCAGCT
AAAGAAAAGGTGCTGCTGCAGATGTAGCCTCAGGTCCCCAGGATGCAGGCAAACACCCCA 
TCTCCAGGGGCTCGGTCACAGTCCCAAGGCTAGGCTCCAGGAGAGGGAGACCGAAGTGGG 
[A,G] 

AAAGGGCAGGGCCTCCAGCAGCAACCAGCCCTCCAGCCCTGGGCTGCCTGATCCCTGGAG 
AGAGCCAGGATGTTTCTCAGGCTCCTCTTGCCCTGCTGTTGTGAGAAGGCAGTTACAGTC 
CTCAGAAGGGACGACTCCACAGTGGAGGTGTCTGGGTATGGGGTTCCTGCTGCCCTGATG 
GTATGATCTGGCTGGAGACGGTTCTGGGGCTCACTGCACCCACTCTAGGCCTGGAGAGGG 
AACAAGAGAGGACGTCTGCAGAGCTGAGGAGCCACATGACTCCTGCCCTCCCATCCTCTG 

8624 GTGCCACCTGCATAGCCCTCACTGTGATTCACGAGTGTGTTTCGTGACAAAGTGTTCAGA
ACAGCCCCCACTCCACCCTGGATAATTATCCACAGAGACCAAGGGAAAAACACAACCAGA 
AAAGTCCACACATACATCCAGGGCAAGTTGCAAGAAAGTGACTCAGTCAGACAGAGTGAG 
TGGTTGTATCCTCACAACCAAACTATTATAGAGACAAAAATTTGATAAATTCAAGCACCA
ATTTTGTTCACGACATTGTATAGGTTTCATGAATCCCCTGACCTCAAGGACAGTTTGCTG 
[A,G] 

TAAGCAAACTAGGAGAATAAAACGTTTATATAGAAAGAGGAAAATCCATGGCACTCATAC 
TCCTACCTCCAACCCCATGCTCATGGCAGACATCACTAATCAATCACAGTACTTTTGATC 
ACTGAAACCCTTATGTGGTCTTAGAATCTTTAACAGGACACTCCAAGAAATCACTGCTGA 
CAGCCAACTGATTTGTGAGATAAGGTCTCCATGCATCTGGATCTTCCATAGAACTGATAG 
TTGCACAGCATAAAATGGTGAGGGTGGGGCCATTGTGGGTTGAGCCACCAAGGAAGGCCA 

8661 TGTTTCGTGACAAAGTGTTCAGAACAGCCCCCACTCCACCCTGGATAATTATCCACAGAG
ACCAAGGGAAAAACACAACCAGAAAAGTCCACACATACATCCAGGGCAAGTTGCAAGAAA
GTGACTCAGTCAGACAGAGTGAGTGGTTGTATCCTCACAACCAAACTATTATAGAGACAA 
AAATTTGATAAATTCAAGCACCAATTTTGTTCACGACATTGTATAGGTTTCATGAATCCC 
CTGACCTCAAGGACAGTTTGCTGATAAGCAAACTAGGAGAATAAAACGTTTATATAGAAA 
[G,A] 

AGGAAAATCCATGGCACTCATACTCCTACCTCCAACCCCATGCTCATGGCAGACATCACT 
AATCAATCACAGTACTTTTGATCACTGAAACCCTTATGTGGTCTTAGAATCTTTAACAGG 
ACACTCCAAGAAATCACTGCTGACAGCCAACTGATTTGTGAGATAAGGTCTCCATGCATC 
TGGATCTTCCATAGAACTGATAGTTGCACAGCATAAAATGGTGAGGGTGGGGCCATTGTG 
GGTTGAGCCACCAAGGAAGGCCATCCAGGCCTGGATGGGCCAGAACAAAGGTACAGATGA 

11754 GCTCCTGGAGCTGGTGGGAGACAAGATTAAGCAAACCTCCCCTGACATGTATCCCTTTGA
CCCCAAGCTCTGCCTCCTCCCTGACCACCCATGCCCTTTCCTTTAACTTCTCAAACAGAT 
ACCAGGGCCTAAACTGCTTTACCTCCCCTCCTACTGAGTCAGGTTAGGTGGTGGGAGGTC 
ACCCATTTCCGAGTTAAACCAATGCAATATGAGTAAAACAAAGTCATGTGGGTATGTCTG 
GGGTAGAGAGAGGGGTAGCAAGTTCATGTGTCCTCCTTGGTCACATATCTCCCAAAGCTC 
[T,C] 

GATCCCTGCCATGGGAAGTGGACAGGAAACATGAGGTCATGACCTGCAGGCATCTTTACT 
GCAGCTCTGCCGGCCTGGAGGGGGAGAGGGGGAGGAAGAAGTATGCGCTGCACATTTCTG 
AGGCTACTGCATTTGCTTTCAAGGCAGAAATCTTGCTCTGAGCAGTCAGCGGCTCCAGTT 
TGGGCCCGATAAGGAAGTTCTCCGTGGCCTCCCTCAGGCAGAGCAGGGAGGAGGCTGACA 
TTGCCAGTCTCTTCTGGGGCCCAAGGCAGGTTGCAGGAGATCCAATCCCATAGACAGCTC 

11836 GACCACCCATGCCCTTTCCTTTAACTTCTCAAACAGATACCAGGGCCTAAACTGCTTTAC
CTCCCCTCCTACTGAGTCAGGTTAGGTGGTGGGAGGTCACCCATTTCCGAGTTAAACCAA 
TGCAATATGAGTAAAACAAAGTCATGTGGGTATGTCTGGGGTAGAGAGAGGGGTAGCAAG 
TTCATGTGTCCTCCTTGGTCACATATCTCCCAAAGCTCTGATCCCTGCCATGGGAAGTGG 
ACAGGAAACATGAGGTCATGACCTGCAGGCATCTTTACTGCAGCTCTGCCGGCCTGGAGG 
[A,G] 

GGAGAGGGGGAGGAAGAAGTATGCGCTGCACATTTCTGAGGCTACTGCATTTGCTTTCAA 
GGCAGAAATCTTGCTCTGAGCAGTCAGCGGCTCCAGTTTGGGCCCGATAAGGAAGTTCTC 
CGTGGCCTCCCTCAGGCAGAGCAGGGAGGAGGCTGACATTGCCAGTCTCTTCTGGGGCCC 
AAGGCAGGTTGCAGGAGATCCAATCCCATAGACAGCTCTGGGCCTCTTGCATTTGAGTTT 
TTCAGAATTAAACTGCAGTATTTTGGAAAGCACATCCTGTCCACTGTTTCTTTGAAGTGA 